The following function returns the sum of the elements of a vector.
[[cpp11::register]]doublesum2_cpp_(doubles x,bool na_rm =false){int n = x.size();double total =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{ total += x[i];}}return total;}
I also have to add the corresponding auxiliary function for the documentation.
#' Return the sum of the coordinates of a vector (C++)#' @inheritParams sum_r#' @param na_rm logical. Should missing values (including `NaN`) be removed?#' @exportsum2_cpp <-function(x, na_rm =FALSE) {sum2_cpp_(as.double(x), na_rm = na_rm)}
To test, I run the following lines in the R console.
library(bench)set.seed(123) # for reproducibilityx <-runif(1e3) # 1,000,000 elementsx[sample(1:1e3, 1e2)] <-NA# change some elements to NA at randomsum(x, na.rm =FALSE)
This function returns the average (or mean) of the elements of a vector.
[[cpp11::register]]doublemean2_cpp_(doubles x,bool na_rm =false){int n = x.size();int m =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{++m;}}if(m ==0){return NA_REAL;}double y =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{ y += x[i];}}return y / m;}
I also need to add the corresponding auxiliary function for the documentation.
#' Return the mean of the coordinates of a vector (C++)#' @inheritParams sum2_cpp#' @exportmean2_cpp <-function(x, na_rm =FALSE) {mean2_cpp_(as.double(x), na_rm = na_rm)}
A benchmark of the two functions is the following.
This function returns the variance of the elements of a vector.
[[cpp11::register]]doublevar2_cpp_(doubles x,bool na_rm =false){int n = x.size();if(n <2){return NA_REAL;}int m =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{++m;}}if(m <2){return NA_REAL;}double ex =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{ ex += x[i];}} ex /= m;double out =0;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{ out += pow(x[i]- ex,2);}}return out /(m -1);}
I also need to add the corresponding auxiliary function for the documentation.
#' Return the variance of the coordinates of a vector (C++)#' @inheritParams sum2_cpp#' @exportvar2_cpp <-function(x, na_rm =FALSE) {var2_cpp_(as.double(x), na_rm = na_rm)}
A benchmark of the two functions is the following.
The next function returns the measure of the differences between the observed values or an estimator (x1, x2, …, xn) and the true value (x0). For example, x1, …, xn could be experimental averages and x0 the true average.
[[cpp11::register]]doublermse2_cpp_(doubles x,double x0){int n = x.size();int m =0;for(int i =0; i < n;++i){if(ISNAN(x[i])){continue;}else{++m;}}if(m ==0){return NA_REAL;}double out;for(int i =0; i < n;++i){if(na_rm && ISNAN(x[i])){continue;}else{ out += pow(x[i]- x0,2.0);}}return sqrt(out / m);}
I also need to add the corresponding auxiliary function for the documentation.
#' Return the root mean square error (C++)#' @inheritParams rmse_r#' @param na_rm logical. Should missing values (including `NaN`) be removed?#' @exportrmse2_cpp <-function(x, x0, na_rm =FALSE) {rmse2_cpp_(as.double(x), as.double(x0), na_rm = na_rm)}
A benchmark of the base R versus C++ implementation is the following.
# create a list with 100 normal distributions with mean 0 and 1 million elements# eachset.seed(123) # for reproducibilityx <-list()for (i in1:1e3) { x[[i]] <-rnorm(1e3)}# compute the mean of each distributionx <-sapply(x, mean)# remove some elements at randomx[sample(1:1e3, 1e2)] <-NArmse2_cpp(x, 0)